Not able to connect to Hadoop server using HadoopFileSystem in pyarrow

Ask Time:2022-04-21T01:38:11         Author:Aman Jain

I am writing Python code that uses pyarrow to connect to a Hadoop server with fs.HadoopFileSystem(host=host_value, port=port_value), but every time I get the following error:

    self.parquet_writer = HDFSWriter(host_value='hdfs://10.110.8.239',port_value=9000)
    File "/app/aerial_server.py", line 54, in __init__
        self.hdfs_client = fs.HadoopFileSystem(host=host_value, port=port_value)
    File "pyarrow/_hdfs.pyx", line 89, in pyarrow._hdfs.HadoopFileSystem.__init__
    File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
    File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status
    OSError: HDFS connection failed

Environment variables:

    PYTHON_VERSION=3.7.13
    HADOOP_OPTS=-Djava.library.path=/app/hadoop-3.3.2/lib/nativ
    JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
    HADOOP_INSTALL=/app/hadoop-3.3.2
    ARROW_LIBHDFS_DIR=/app/hadoop-3.3.2/lib/native
    HADOOP_MAPRED_HOME=/app/hadoop-3.3.2
    HADOOP_COMMON_HOME=/app/hadoop-3.3.2
    HADOOP_HOME=/app/hadoop-3.3.2
    HADOOP_HDFS_HOME=/app/hadoop-3.3.2
    PYTHON_PIP_VERSION=22.0.4
    CLASSPATH=/app/hadoop-3.3.2/bin/hdfs classpath --glob
    HADOOP_COMMON_LIB_NATIVE_DIR=/app/hadoop-3.3.2/lib/native
    PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/app/hadoop-3.3.2/sbin:/app/hadoop-3.3.2/bin
    _=/usr/bin/env
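
One detail in the listing above may be worth checking: CLASSPATH appears to contain the command text itself (".../bin/hdfs classpath --glob") rather than the jar list that command prints. pyarrow's HDFS support relies on libhdfs, which reads the Hadoop jars from CLASSPATH, so a literal command string there could plausibly lead to "HDFS connection failed". The sketch below is one possible way to fill CLASSPATH from the command's output before connecting. The helper names resolve_hadoop_classpath and connect, and the default hadoop_home path, are hypothetical and only for illustration; this is not confirmed to be the cause of the error in this question.

```python
import os
import subprocess


def resolve_hadoop_classpath(hadoop_home, run=subprocess.run):
    """Return the expanded jar list printed by `hdfs classpath --glob`.

    `run` is injectable so the helper can be exercised without a Hadoop install.
    """
    hdfs_bin = os.path.join(hadoop_home, "bin", "hdfs")
    result = run(
        [hdfs_bin, "classpath", "--glob"],
        check=True,           # raise if the hdfs command fails
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout to str
    )
    return result.stdout.strip()


def connect(host, port, hadoop_home="/app/hadoop-3.3.2"):
    """Set CLASSPATH from the command output, then open the HDFS connection.

    Assumption: CLASSPATH is set before the first HDFS connection in this
    process, since libhdfs reads it when the JVM is started.
    """
    os.environ["CLASSPATH"] = resolve_hadoop_classpath(hadoop_home)
    from pyarrow import fs  # imported here so the helper above works without pyarrow
    return fs.HadoopFileSystem(host=host, port=port)
```

With helpers like these, the call from the question would become something like connect(host_value, port_value). In a shell, the equivalent is to export CLASSPATH using command substitution around the hdfs classpath --glob call, so that the variable holds the output rather than the command.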

Author: Aman Jain. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/71943956/not-able-to-connect-to-hadoop-server-using-hadoopfilesystem-in-pyarrrow